Goto

Collaborating Authors

 conversational system


Therapeutic AI and the Hidden Risks of Over-Disclosure: An Embedded AI-Literacy Framework for Mental Health Privacy

Anvari, Soraya S., Wehbe, Rina R.

arXiv.org Artificial Intelligence

Large Language Models (LLMs) are increasingly deployed in mental health contexts, from structured therapeutic support tools to informal chat-based well-being assistants. While these systems increase accessibility, scalability, and personalization, their integration into mental health care brings privacy and safety challenges that have not been well-examined. Unlike traditional clinical interactions, LLM-mediated therapy often lacks a clear structure for what information is collected, how it is processed, and how it is stored or reused. Users without clinical guidance may over-disclose personal information, which is sometimes irrelevant to their presenting concern, due to misplaced trust, lack of awareness of data risks, or the conversational design of the system. This overexposure raises privacy concerns and also increases the potential for LLM bias, misinterpretation, and long-term data misuse. We propose a framework embedding Artificial Intelligence (AI) literacy interventions directly into mental health conversational systems, and outline a study plan to evaluate their impact on disclosure safety, trust, and user experience.


Few-Shot Query Intent Detection via Relation-Aware Prompt Learning

Zhang, Liang, Li, Yuan, Zhang, Shijie, Zhang, Zheng, Li, Xitong

arXiv.org Artificial Intelligence

Abstract--Intent detection is a crucial component of modern conversational systems, since accurately identifying user intent at the beginning of a conversation is essential for generating effective responses. Currently, most of the intent detectors can only work effectively with the assumption that high-quality labeled data is available (i.e., the collected data is labeled by domain experts). T o ease this process, recent efforts have focused on studying this problem under a more challenging few-shot scenario. These approaches primarily leverage large-scale unlabeled dialogue text corpora to pretrain language models through various pretext tasks, followed by fine-tuning for intent detection with very limited annotations. Despite the improvements achieved, existing methods have predominantly focused on textual data, neglecting to effectively capture the crucial structural information inherent in conversational systems, such as the query-query relation and query-answer relation. Specifically, the query-query relation captures the semantic relevance between two queries within the same session, reflecting the user's refinement of her request, while the query-answer relation represents the conversational agent's clarification and response to a user query . T o address this gap, we propose SAID, a novel framework that integrates both textual and relational structure information in a unified manner for model pretraining for the first time. Firstly, we introduce a relation-aware prompt module, which employs learnable relation tokens as soft prompts, enabling the model to learn shared knowledge across multiple relations and become explicitly aware of how to interpret query text within the context of these relations. Secondly, we reformulate the few-shot intent detection problem using prompt learning by creating a new intent-specific relation-aware prompt, which incorporates intent-specific relation tokens alongside the semantic information embedded in intent names, helping the pretrained model effectively transfer the pretrained knowledge acquired from related relational perspectives. Building on this framework, we further propose a novel mechanism, the query-adaptive attention network (QueryAdapt), which operates at the relation token level by generating intent-specific relation tokens from well-learned query-query and query-answer relations explicitly, enabling more fine-grained knowledge transfer. Extensive experimental results on two real-world datasets demonstrate that SAID significantly outperforms state-of-the-art methods, achieving improvements of up to 27% in the 3-shot setting. When equipped with the relation token-level QueryAdapt module, it yields additional performance gains of up to 21% in the same setting.


PERCY: Personal Emotional Robotic Conversational System

Meng, Zhijin, Althubyani, Mohammed, Xie, Shengyuan, Razzak, Imran, Sandoval, Eduardo B., Bamdad, Mahdi, Cruz, Francisco

arXiv.org Artificial Intelligence

Traditional rule-based conversational robots, constrained by predefined scripts and static response mappings, fundamentally lack adaptability for personalized, long-term human interaction. While Large Language Models (LLMs) like GPT-4 have revolutionized conversational AI through open-domain capabilities, current social robots implementing LLMs still lack emotional awareness and continuous personalization. This dual limitation hinders their ability to sustain engagement across multiple interaction sessions. We bridge this gap with PERCY (Personal Emotional Robotic Conversational sYstem), a system designed to enable open-domain, multi-turn dialogues by dynamically analyzing users' real-time facial expressions and vocabulary to tailor responses based on their emotional state. Built on a ROS-based multimodal framework, PERCY integrates a fine-tuned GPT-4 reasoning engine, combining textual sentiment analysis with visual emotional cues to accurately assess and respond to user emotions. We evaluated PERCY's performance through various dialogue quality metrics, showing strong coherence, relevance, and diversity. Human evaluations revealed PERCY's superior personalization and comparable naturalness to other models. This work highlights the potential for integrating advanced multimodal perception and personalization in social robot dialogue systems.


The Three Social Dimensions of Chatbot Technology

Figueroa-Torres, Mauricio

arXiv.org Artificial Intelligence

The development and deployment of chatbot technology, while spanning decades and employing different techniques, require innovative frameworks to understand and interrogate their functionality and implications. A mere technocentric account of the evolution of chatbot technology does not fully illuminate how conversational systems are embedded in societal dynamics. This study presents a structured examination of chatbots across three societal dimensions, highlighting their roles as objects of scientific research, commercial instruments, and agents of intimate interaction. Through furnishing a dimensional framework for the evolution of conversational systems, from laboratories to marketplaces to private lives, this article contributes to the wider scholarly inquiry of chatbot technology and its impact in lived human experiences and dynamics.


Adaptive Retrieval-Augmented Generation for Conversational Systems

Wang, Xi, Sen, Procheta, Li, Ruizhe, Yilmaz, Emine

arXiv.org Artificial Intelligence

Despite the success of integrating large language models into the development of conversational systems, many studies have shown the effectiveness of retrieving and augmenting external knowledge for informative responses. Hence, many existing studies commonly assume the always need for Retrieval Augmented Generation (RAG) in a conversational system without explicit control. This raises a research question about such a necessity. In this study, we propose to investigate the need for each turn of system response to be augmented with external knowledge. In particular, by leveraging human judgements on the binary choice of adaptive augmentation, we develop RAGate, a gating model, which models conversation context and relevant inputs to predict if a conversational system requires RAG for improved responses. We conduct extensive experiments on devising and applying RAGate to conversational models and well-rounded analyses of different conversational scenarios. Our experimental results and analysis indicate the effective application of RAGate in RAG-based conversational systems in identifying system responses for appropriate RAG with high-quality responses and a high generation confidence. This study also identifies the correlation between the generation's confidence level and the relevance of the augmented knowledge.


Interpretable User Satisfaction Estimation for Conversational Systems with Large Language Models

Lin, Ying-Chun, Neville, Jennifer, Stokes, Jack W., Yang, Longqi, Safavi, Tara, Wan, Mengting, Counts, Scott, Suri, Siddharth, Andersen, Reid, Xu, Xiaofeng, Gupta, Deepak, Jauhar, Sujay Kumar, Song, Xia, Buscher, Georg, Tiwary, Saurabh, Hecht, Brent, Teevan, Jaime

arXiv.org Artificial Intelligence

Accurate and interpretable user satisfaction estimation (USE) is critical for understanding, evaluating, and continuously improving conversational systems. Users express their satisfaction or dissatisfaction with diverse conversational patterns in both general-purpose (ChatGPT and Bing Copilot) and task-oriented (customer service chatbot) conversational systems. Existing approaches based on featurized ML models or text embeddings fall short in extracting generalizable patterns and are hard to interpret. In this work, we show that LLMs can extract interpretable signals of user satisfaction from their natural language utterances more effectively than embedding-based approaches. Moreover, an LLM can be tailored for USE via an iterative prompting framework using supervision from labeled examples. The resulting method, Supervised Prompting for User satisfaction Rubrics (SPUR), not only has higher accuracy but is more interpretable as it scores user satisfaction via learned rubrics with a detailed breakdown.


Integrating Large Language Models with Multimodal Virtual Reality Interfaces to Support Collaborative Human-Robot Construction Work

Park, Somin, Menassa, Carol C., Kamat, Vineet R.

arXiv.org Artificial Intelligence

In the construction industry, where work environments are complex, unstructured and often dangerous, the implementation of Human-Robot Collaboration (HRC) is emerging as a promising advancement. This underlines the critical need for intuitive communication interfaces that enable construction workers to collaborate seamlessly with robotic assistants. This study introduces a conversational Virtual Reality (VR) interface integrating multimodal interaction to enhance intuitive communication between construction workers and robots. By integrating voice and controller inputs with the Robot Operating System (ROS), Building Information Modeling (BIM), and a game engine featuring a chat interface powered by a Large Language Model (LLM), the proposed system enables intuitive and precise interaction within a VR setting. Evaluated by twelve construction workers through a drywall installation case study, the proposed system demonstrated its low workload and high usability with succinct command inputs. The proposed multimodal interaction system suggests that such technological integration can substantially advance the integration of robotic assistants in the construction industry.


Dynamic Contexts for Generating Suggestion Questions in RAG Based Conversational Systems

Tayal, Anuja, Tyagi, Aman

arXiv.org Artificial Intelligence

When interacting with Retrieval-Augmented Generation (RAG)-based conversational agents, the users must carefully craft their queries to be understood correctly. Yet, understanding the system's capabilities can be challenging for the users, leading to ambiguous questions that necessitate further clarification. This work aims to bridge the gap by developing a suggestion question generator. To generate suggestion questions, our approach involves utilizing dynamic context, which includes both dynamic few-shot examples and dynamically retrieved contexts. Through experiments, we show that the dynamic contexts approach can generate better suggestion questions as compared to other prompting approaches.


"You tell me": A Dataset of GPT-4-Based Behaviour Change Support Conversations

Meyer, Selina, Elsweiler, David

arXiv.org Artificial Intelligence

Conversational agents are increasingly used to address emotional needs on top of information needs. One use case of increasing interest are counselling-style mental health and behaviour change interventions, with large language model (LLM)-based approaches becoming more popular. Research in this context so far has been largely system-focused, foregoing the aspect of user behaviour and the impact this can have on LLM-generated texts. To address this issue, we share a dataset containing text-based user interactions related to behaviour change with two GPT-4-based conversational agents collected in a preregistered user study. This dataset includes conversation data, user language analysis, perception measures, and user feedback for LLM-generated turns, and can offer valuable insights to inform the design of such systems based on real interactions.


Comprehensive Benchmarking of Entropy and Margin Based Scoring Metrics for Data Selection

Sabbineni, Anusha, Anand, Nikhil, Minakova, Maria

arXiv.org Artificial Intelligence

While data selection methods have been studied extensively in active learning, data pruning, and data augmentation settings, there is little evidence for the efficacy of these methods in industry scale settings, particularly in low-resource languages. Our work presents ways of assessing prospective training examples in those settings for their "usefulness" or "difficulty". We also demonstrate how these measures can be used in selecting important examples for training supervised machine learning models. We primarily experiment with entropy and Error L2-Norm (EL2N) scores. We use these metrics to curate high quality datasets from a large pool of \textit{Weak Signal Labeled} data, which assigns no-defect high confidence hypotheses during inference as ground truth labels. We then conduct training data augmentation experiments using these de-identified datasets and demonstrate that score-based selection can result in a 2% decrease in semantic error rate and 4%-7% decrease in domain classification error rate when compared to the baseline technique of random selection.